Data Mining and Serial Documents
نویسنده
چکیده
This paper is concerned with the investigation of the relevance and suitability of the data mining approach to serial documents. Conceptually the paper is divided into three parts. The first part presents the salient features of data mining and its symbiotic relationship to data warehousing. In the second part of the paper, historical serial documents are introduced, and the Ottoman Tax Registers (Defters) are taken as a case study. Their conformance to the data mining approach is established in terms of structure, analysis and results. A high-level conceptual model for the Defters is also presented. The final part concludes with a brief consideration of the implication of data mining for historical research.
منابع مشابه
خوشهبندی اسناد مبتنی بر آنتولوژی و رویکرد فازی
Data mining, also known as knowledge discovery in database, is the process to discover unknown knowledge from a large amount of data. Text mining is to apply data mining techniques to extract knowledge from unstructured text. Text clustering is one of important techniques of text mining, which is the unsupervised classification of similar documents into different groups. The most important step...
متن کاملارائه مدلی برای استخراج اطلاعات از مستندات متنی، مبتنی بر متنکاوی در حوزه یادگیری الکترونیکی
As computer networks become the backbones of science and economy, enormous quantities documents become available. So, for extracting useful information from textual data, text mining techniques have been used. Text Mining has become an important research area that discoveries unknown information, facts or new hypotheses by automatically extracting information from different written documents. T...
متن کاملMining Multiple Text Sequence with Key Management
A Text stream is a sequence of chronologically ordered documents, being generated in various forms. Multiple text streams that are correlated to each other by sharing common topics. Our aim is to extract the knowledge of the text stream from the listed documents. In particular, vulnerabilities could include compromise of data security and loss of information which leads to data leakage. To prov...
متن کاملA review of text mining approaches and their function in discovering and extracting a topic
Background and aim: Four text mining methods are examined and focused on understanding and identifying their properties and limitations in subject discovery. Methodology: The study is an analytical review of the literature of text mining and topic modeling. Findings: LSA could be used to classify specific and unique topics in documents that address only a single topic. The other three text min...
متن کاملXML structural delta mining: Issues and challenges
Recently, there is an increasing research efforts in XML data mining. These research efforts largely assumed that XML documents are static. However, in reality, the documents are rarely static. In this paper, we propose a novel research problem called XML structural delta mining. The objective of XML structural delta mining is to discover knowledge by analyzing structural evolution pattern (als...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Computers and the Humanities
دوره 35 شماره
صفحات -
تاریخ انتشار 2001